Hyponym Extraction from the Web based on Property Inheritance of Text and Image Features

نویسنده

  • Shun Hattori
چکیده

Concept hierarchy knowledge, such as hyponymy and meronymy, is very important for various Natural Language Processing systems. While WordNet and Wikipedia are being manually constructed and maintained as lexical ontologies, many researchers have tackled how to extract concept hierarchies from very large corpora of text documents such as the Web not manually but automatically. However, their methods are mostly based on lexico-syntactic patterns as not necessary but sufficient conditions of hyponymy and meronymy, so they can achieve high precision but low recall when using stricter patterns or they can achieve high recall but low precision when using looser patterns. Therefore, we need necessary conditions of hyponymy and meronymy to achieve high recall and not low precision. The previous papers have assumed “Property Inheritance” from a target concept to its hyponyms and/or “Property Aggregation” from its hyponyms to the target concept to be necessary and sufficient conditions of hyponymy, and proposed several methods to extract hyponymy relations from the Web based on property inheritance and/or property aggregation of text features such as meronyms and behavior. This paper proposes a method to acquire hyponymy relations from the Web based on property inheritance of not only text features but also image features for each conceptual word. Keywords-hyponymy; meronymy; concept hierarchy; Web mining; image analysis; property inheritance; typical image;

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Document Analysis And Classification Based On Passing Window

In this paper we present Document analysis and classification system to segment and classify contents of Arabic document images. This system includes preprocessing, document segmentation, feature extraction and document classification. A document image is enhanced in the preprocessing by removing noise, binarization, and detecting and correcting image skew. In document segmentation, an algorith...

متن کامل

Image retrieval using the combination of text-based and content-based algorithms

Image retrieval is an important research field which has received great attention in the last decades. In this paper, we present an approach for the image retrieval based on the combination of text-based and content-based features. For text-based features, keywords and for content-based features, color and texture features have been used. Query in this system contains some keywords and an input...

متن کامل

Data Extraction using Content-Based Handles

In this paper, we present an approach and a visual tool, called HWrap (Handle Based Wrapper), for creating web wrappers to extract data records from web pages. In our approach, we mainly rely on the visible page content to identify data regions on a web page. In our extraction algorithm, we inspired by the way a human user scans the page content for specific data. In particular, we use text fea...

متن کامل

Object-oriented Semantic and Sensory Knowledge Extraction from the Web

Automatic knowledge extraction from such a very large document corpus as the Web is one of the hottest research topics in the domain of Artificial Intelligence and Database technologies. This chapter introduces my object-oriented and the existing methods to extract semantic (e.g., hyponymy and meronymy) and sensory (e.g., visual and aural) knowledge from the Web, and compares them by showing se...

متن کامل

EXTRACTION-BASED TEXT SUMMARIZATION USING FUZZY ANALYSIS

Due to the explosive growth of the world-wide web, automatictext summarization has become an essential tool for web users. In this paperwe present a novel approach for creating text summaries. Using fuzzy logicand word-net, our model extracts the most relevant sentences from an originaldocument. The approach utilizes fuzzy measures and inference on theextracted textual information from the docu...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2012